CHAPTER 6 Taking All Kinds of Samples 81
This process ensures that your sample of 20 patients was taken completely at
random. Statistical packages like those described in Chapter 4 have RNG com-
mands similar to the one in Excel.
Learners sometimes think that as long as they sort a spreadsheet of data by a
column containing any value and then select a sample of rows from the top, they
have automatically obtained an SRS. This is not correct! If you think about it more
carefully, you will realize why. If you sort names alphabetically, you will see
patterns in names (such as religious names, or names associated with certain
languages, countries, or ethnicities). If you sort by another identifying column, such
as email address or city of residence, you will again see patterns in the data. If you
take the top rows of data sorted this way, your sample will be biased rather than
random, and it will not be representative. That is why it is important to sort by a
column of random numbers generated by an RNG if you are taking an SRS electronically.
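The sort-by-a-random-column approach can be sketched in a few lines of Python. This is only an illustration, not the Excel procedure itself: the function name simple_random_sample, the list of 200 made-up patient IDs, and the seed are all assumptions introduced for the example.

```python
import random

def simple_random_sample(rows, n, seed=None):
    """Return n rows chosen by sorting on a freshly generated random key."""
    rng = random.Random(seed)
    # Pair every row with its own random number (the "RNG column").
    keyed = [(rng.random(), row) for row in rows]
    # Sort by the random number only, then keep the top n rows.
    keyed.sort(key=lambda pair: pair[0])
    return [row for _, row in keyed[:n]]

# Hypothetical clinic list of 200 patient IDs (made up for illustration).
patients = [f"patient_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(patients, 20, seed=42)
print(sample)
```

Because the sort key is unrelated to names, email addresses, or cities, the top 20 rows carry none of the patterns that those columns would introduce. Fixing the seed simply makes the draw repeatable.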
Taking an SRS intuitively seems like the optimal way to draw a representative
sample. However, there are caveats. In the previous example, you started with a
clinical population in the form of a printed or electronic list of patients from which
you could draw a sample. But what if you want to sample from patients presenting
to the emergency department during a particular period of time in the future?
Such a list does not exist. In a situation like that, you could use systematic sam-
pling, which is explained later in the section “Engaging in systematic
sampling.”
Another caveat of SRS is that it can miss important subgroups. Imagine that in
your list of clinic patients, only 10 percent were pediatric patients (defined as
patients under the age of 18 years). Because 10 percent of 20 is two, you may
expect that a random sample of 20 patients from a population where 10 percent
are pediatric would include two pediatric patients. But in practice, in a situation
like this, it would not be unusual for an SRS of 20 patients to include zero pediatric
patients. If your sample needs to ensure representation of certain subgroups, then
you should consider using stratified sampling instead.
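To see why zero pediatric patients "would not be unusual," it can help to work out the chance directly. The sketch below makes two assumptions that are not in the text: the first calculation treats the population as very large, so each draw is independent, and the second uses a hypothetical list of exactly 200 patients, 20 of whom are pediatric.

```python
from math import comb

sample_size = 20
pediatric_share = 0.10

# Approximation that treats each draw as independent (very large population).
p_zero_approx = (1 - pediatric_share) ** sample_size

# Exact value for a hypothetical finite list: 200 patients, 20 of them pediatric.
population = 200
pediatric = 20
p_zero_exact = comb(population - pediatric, sample_size) / comb(population, sample_size)

print(f"approximate P(no pediatric patients) = {p_zero_approx:.3f}")
print(f"exact P for a list of {population} = {p_zero_exact:.3f}")
```

Under the large-population approximation the chance is about 0.12, or roughly one SRS in eight, which is far from rare.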
Taking a stratified sample
In the previous section, we discussed a scenario where 10 percent of the patients
of a clinic are pediatric patients, and taking a sample of 20 using an SRS from a list
of the clinic population runs the risk of not including any pediatric patients. If
pediatric patients are important to the study, then this problem can be solved
with stratified sampling. The word stratum refers to a layer (as you see in a layer
cake), and the word strata is the plural of stratum. Stratified sampling can be seen
as sampling from strata, or layers.
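The idea of sampling from layers can be sketched as follows: split the list into strata, then draw a random sample within each one. Everything specific here is an assumption for illustration only, including the function name stratified_sample, the made-up ages, and the choice to take 2 pediatric and 18 adult patients so that the sample mirrors the 10 percent share.

```python
import random

def stratified_sample(rows, stratum_of, n_per_stratum, seed=None):
    """Draw a separate random sample from each stratum and combine them."""
    rng = random.Random(seed)
    # Group the rows into strata (layers) using the labeling function.
    strata = {}
    for row in rows:
        strata.setdefault(stratum_of(row), []).append(row)
    # Sample the requested number of rows within each stratum.
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, n_per_stratum[name]))
    return sample

def age_group(patient):
    # Pediatric is defined as under the age of 18 years.
    return "pediatric" if patient["age"] < 18 else "adult"

# Hypothetical list: 200 patients, with ages made up so that 20 are under 18.
patients = [{"id": i, "age": 10 if i <= 20 else 40} for i in range(1, 201)]
chosen = stratified_sample(patients, age_group, {"pediatric": 2, "adult": 18}, seed=1)
print(sum(1 for p in chosen if p["age"] < 18), "pediatric patients in the sample")
```

Unlike an SRS of the whole list, this draw cannot come back with zero pediatric patients, because the pediatric stratum is sampled on its own.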